 intermediate structure


When does a bridge become an aeroplane?

arXiv.org Artificial Intelligence

Despite recent advances in population-based structural health monitoring (PBSHM), knowledge transfer between highly-disparate structures (i.e., heterogeneous populations) remains a challenge. It has been proposed that heterogeneous transfer may be accomplished via intermediate structures that bridge the gap in information between the structures of interest. A key aspect of the technique is the idea that by varying parameters such as material properties and geometry, one structure can be continuously morphed into another. The current work demonstrates the development of these interpolating structures, via case studies involving the parameterisation of (and transfer between) a simple, simulated 'bridge' and 'aeroplane'. The facetious question 'When is a bridge not an aeroplane?' has been previously asked in the context of predicting positive transfer based on structural similarity. While the obvious answer to this question is 'Always,' the current work demonstrates that in some cases positive transfer can be achieved between highly-disparate systems.
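The idea of continuously morphing one structure into another by varying its parameters can be illustrated with a minimal sketch. It assumes each structure is summarised by a flat set of numeric parameters and that intermediate structures are obtained by simple linear interpolation. The parameter names and values below are hypothetical and are not taken from the paper, whose parameterisation may differ.

```python
def interpolate_structures(source, target, n_steps):
    """Return n_steps + 1 parameter sets that morph `source` into `target`.

    Assumption: both structures share the same numeric parameters and a
    straight-line path in parameter space is an acceptable morphing.
    """
    if source.keys() != target.keys():
        raise ValueError("both structures must share the same parameters")
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    chain = []
    for i in range(n_steps + 1):
        t = i / n_steps  # 0.0 at the source, 1.0 at the target
        chain.append({k: (1 - t) * source[k] + t * target[k] for k in source})
    return chain


# Hypothetical parameter values, chosen only for illustration.
bridge = {"span_m": 40.0, "youngs_modulus_gpa": 200.0, "thickness_m": 0.5}
aeroplane = {"span_m": 30.0, "youngs_modulus_gpa": 70.0, "thickness_m": 0.1}

# Three intermediate structures sit between the two end points.
chain = interpolate_structures(bridge, aeroplane, n_steps=4)
```

Each element of `chain` could then be passed to a simulator to generate data for an intermediate structure, so that transfer proceeds between neighbouring structures rather than in one large jump.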


On the topology and geometry of population-based SHM

arXiv.org Machine Learning

Population-Based Structural Health Monitoring (PBSHM) aims to leverage information across populations of structures in order to enhance diagnostics on those with sparse data. The discipline of transfer learning provides the mechanism for this capability. One recent paper in PBSHM proposed a geometrical view in which the structures were represented as graphs in a metric "base space" with their data captured in the "total space" of a vector bundle above the graph space. This view was more suggestive than mathematically rigorous, although it did allow certain useful arguments. One bar to more rigorous analysis was the absence of a meaningful topology on the graph space, and thus no useful notion of continuity. The current paper aims to address this problem by moving to parametric families of structures in the base space, essentially changing points in the graph space to open balls. This allows the definition of open sets in the fibre space and thus allows continuous variation between fibres. The new ideas motivate a new geometrical mechanism for transfer learning in which data are transported from one fibre to an adjacent one; i.e., from one structure to another.


Balancing between the Local and Global Structures (LGS) in Graph Embedding

arXiv.org Artificial Intelligence

We present a method for balancing between the Local and Global Structures (LGS) in graph embedding, via a tunable parameter. Some embedding methods aim to capture global structures, while others attempt to preserve local neighborhoods. Few methods attempt to do both, and it is not always possible to capture both local and global information well in two dimensions, which is where most graph drawings live. The choice of using a local or a global embedding for visualization depends not only on the task but also on the structure of the underlying data, which may not be known in advance. For a given graph, LGS aims to find a good balance between the local and global structure to preserve. We evaluate the performance of LGS with synthetic and real-world datasets and our results indicate that it is competitive with the state-of-the-art methods, using established quality metrics such as stress and neighborhood preservation. We introduce a novel quality metric, cluster distance preservation, to assess intermediate structure capture. All source code, datasets, experiments and analysis are available online.
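The two established metrics named above can be sketched as follows. This uses common textbook-style definitions: stress is the sum over node pairs of the squared relative error between embedded distance and graph-theoretic distance, and neighborhood preservation is the mean Jaccard similarity between each node's graph neighbours and its nearest embedded nodes. The exact normalisation and scaling used in the LGS evaluation may differ. The function names are illustrative, and the graph is assumed to be connected and unweighted.

```python
import math
from collections import deque


def graph_distances(adj):
    """All-pairs hop distances by BFS; `adj` maps node -> list of neighbours."""
    dist = {}
    for source in adj:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dist[source] = seen
    return dist


def stress(adj, pos):
    """Sum over node pairs of ((embedded distance - graph distance) / graph distance)^2."""
    dist = graph_distances(adj)
    nodes = sorted(adj)
    total = 0.0
    for a, i in enumerate(nodes):
        for j in nodes[a + 1:]:
            d_graph = dist[i][j]  # assumes j is reachable from i
            d_embed = math.dist(pos[i], pos[j])
            total += ((d_embed - d_graph) / d_graph) ** 2
    return total


def neighborhood_preservation(adj, pos):
    """Mean Jaccard similarity between graph neighbours and the deg(v) nearest embedded nodes."""
    scores = []
    for v in adj:
        k = len(adj[v])
        others = sorted((u for u in adj if u != v),
                        key=lambda u: math.dist(pos[v], pos[u]))
        nearest = set(others[:k])
        neighbours = set(adj[v])
        union = nearest | neighbours
        scores.append(len(nearest & neighbours) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


# A 4-cycle drawn as a unit square: edges are drawn exactly, diagonals are not.
square_adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
square_pos = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
square_stress = stress(square_adj, square_pos)
square_np = neighborhood_preservation(square_adj, square_pos)
```

In this example every local neighbourhood is preserved, while the stress is non-zero because the two diagonal pairs are drawn closer than their graph distance of 2. Tension of this kind between local and global criteria is what a tunable balance is meant to address.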


Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring

arXiv.org Artificial Intelligence

Image-text pretrained models, e.g., CLIP, have shown impressive general multi-modal knowledge learned from large-scale image-text data pairs, thus attracting increasing attention for their potential to improve visual representation learning in the video domain. In this paper, based on the CLIP model, we revisit temporal modeling in the context of image-to-video knowledge transferring, which is the key point for extending image-text pretrained models to the video domain. We find that current temporal modeling mechanisms are tailored to either high-level semantic-dominant tasks (e.g., retrieval) or low-level visual pattern-dominant tasks (e.g., recognition), and fail to work on the two cases simultaneously. The key difficulty lies in modeling temporal dependency while taking advantage of both high-level and low-level knowledge in the CLIP model. To tackle this problem, we present Spatial-Temporal Auxiliary Network (STAN) - a simple and effective temporal modeling mechanism extending the CLIP model to diverse video tasks. Specifically, to realize both low-level and high-level knowledge ...

However, it is hard to get a pretrained model as powerful as CLIP in the video domain due to the unaffordable demands on computation resources and the difficulty of collecting video-text data pairs as large and diverse as image-text data. Instead of directly pursuing video-text pretrained models [17, 27], a potential alternative solution that benefits video downstream tasks is to transfer the knowledge in image-text pretrained models to the video domain, which has attracted increasing attention in recent years [12, 13, 26, 29, 30, 41]. Extending pretrained 2D image models to the video domain is a widely-studied topic in deep learning [4, 7], and the key point lies in empowering 2D models with the capability of modeling temporal dependency between video frames while taking advantage of knowledge in the pretrained models. In this paper, based on CLIP [32], we revisit temporal modeling in the context of image-to-video knowledge transferring, and present Spatial-Temporal Auxiliary Network (STAN) - a new temporal modeling method that is easy and effective for extending image-text pretrained models to diverse downstream video tasks.


Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction

arXiv.org Artificial Intelligence

In this paper, we consider the problem of open information extraction (OIE) for extracting entity and relation level intermediate structures from sentences in the open domain. We focus on four types of valuable intermediate structures (Relation, Attribute, Description, and Concept), and propose a unified knowledge expression form, SAOKE, to express them. We publicly release a data set which contains more than forty thousand sentences and the corresponding facts in the SAOKE format labeled by crowd-sourcing. To our knowledge, this is the largest publicly available human-labeled data set for open information extraction tasks. Using this labeled SAOKE data set, we train an end-to-end neural model using the sequence-to-sequence paradigm, called Logician, to transform sentences into facts. Unlike existing algorithms, which generally focus on extracting each single fact without considering other possible facts, Logician performs a global optimization over all possible involved facts for each sentence, in which facts not only compete with each other to attract the attention of words, but also cooperate to share words. An experimental study on various types of open domain relation extraction tasks reveals the consistent superiority of Logician to other state-of-the-art algorithms. The experiments verify the reasonableness of the SAOKE format, the value of the SAOKE data set, the effectiveness of the proposed Logician model, and the feasibility of applying the end-to-end learning paradigm to supervised data sets for the challenging tasks of open information extraction.